Dr. Dobb's JOURNAL

SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER

CONTENTS

FEATURES



OBJECT PERSISTENCE: BEYOND SERIALIZATION 

by Timo Salo, Justin Hill, Scott Rich, Chuck Bridgham, and Daniel Berg 
Our authors describe techniques and frameworks necessary to successfully implement scalable
object persistence for complex database systems. Much of the technology they examine has been 
incorporated in development tools ranging from VisualAge for Java, to EJB tools for WebSphere. 

JAVA PROXIES FOR DATABASE OBJECTS 

by Paul Upton

Java proxy technology lets you define database object schema using the database ODL. To illustrate
how such a technology might be implemented, Paul provides examples based on the Jasmine
object-oriented database.

VBSCRIPT AND SQL CALENDARS 

by John Donovan Lambert 

John presents the VBScripts he uses for inputting SQL results into a web calendar, and discusses
how you can port these scripts to Java, Perl, Cold Fusion, or whatever language you prefer.

THE CVS DATA FORMAT

by Cesar A. Gonzalez Perez

The CVS data format stores cartographic data for a specific geographic area into a single file. Cesar
examines the format, then presents a tool for converting CVS files into DXF format.

AGENT ITINERARIES

by Russell P. Lentini, Goutham P. Rao, and Jon N. Thies 

Instead of examining itineraries in the traditional way as a list of tasks to be performed by agents,
our authors treat itineraries as a metaprogram, a way of programming an agent and inadvertently
its goal. To illustrate, they'll present an itinerary that performs a database query.

JAVA AND DIGITAL IMAGES 

by David H. Martin and Johnny Martin 

Capturing, storing, and retrieving images is an often-overlooked feature that many applications
could benefit from. David and Johnny describe "Grabber for Java," an API that encapsulates the
functionality necessary for video capture.



EMBEDDED SYSTEMS 








THE SPARK REAL-TIME KERNEL 

by Anatoly Kotlarsky 

SPARK, short for "Small Portable Adjustable Real-time Kernel," is a royalty-free, fast, tiny,
portable real-time kernel. Anatoly describes how he used it to build a video bar-code
scanner.



INTERNET PROGRAMMING 



AUTOMATED TESTING FOR WEB APPLICATIONS

by M. Selvakumar

The technique for automated web-user-interface testing presented here is based on HTML,
JavaScript, and CGI, and implemented for Netscape Communicator 4.04 and Apache 1.2.



DR. DOBB'S JOURNAL (ISSN 1044-789X) is published monthly by Miller Freeman, Inc., 600 Harrison Street, San Francisco, CA 94017; 415-905-2200. Periodicals Postage Paid at San Francisco
and at additional mailing offices. SUBSCRIPTION: $34.95 for 1 year; $69.90 for 2 years. International orders must be prepaid. Payment may be made via Mastercard, Visa, or American
Express, or via U.S. funds drawn on a U.S. bank. Canada and Mexico: $45.00 per year. All other foreign: $70.00 per year. POSTMASTER: Send address changes to Dr. Dobb's Journal,
P.O. Box 56188, Boulder, CO 80328-6188. GST (Canada) #R124771239. Canada Post International Publications Mail Product (Canadian Distribution) Sales Agreement No. 0548677.
FOREIGN NEWSSTAND DISTRIBUTOR: Worldwide Media Service Inc., 30 Montgomery St., Jersey City, NJ 07302; 212-332-7100. Entire contents © 1999 by Miller Freeman, Inc., unless
otherwise noted on specific articles. All rights reserved. Standard mail (A) enclosed in versions 3C and 3D.

Dr. Dobb's Journal, May 1999






MAY 1999 
VOLUME 24, ISSUE 5 





PROGRAMMER'S TOOLCHEST

THE VERSION CONTROL PROCESS 

by Aspi Havewala 

Source-code version control is a set of working rules for code sharing that lets 
developers modify files in an exclusive way. As such, it is one of the most 
important, yet least understood, areas of software development. 



COLUMNS 

C PROGRAMMING 115 

by Al Stevens

Al ponders the question, "What's in an argv?" and speculates on why the
answer is different for DOS and UNIX developers.

JAVA Q&A 121

by Lou Grinzo

How do you run untrusted classes? Lou takes a look at a couple of different
answers to this question.

ALGORITHM ALLEY 125 

by Jon Bentley 

Last month, Jon presented techniques for analyzing the performance of 
algorithms. This month, he examines how code-tuning techniques speed up 
the various algorithms. 

DR. ECCO'S OMNIHEURIST CORNER 130 

by Dennis E. Shasha 

Dr. Ecco joins forces with the NSA, FBI, and other crime-stoppers to help fight
web terrorism.

PROGRAMMER'S BOOKSHELF 133 

by Gregory V. Wilson and William Stallings 
Greg examines Component Software: Beyond Object-Oriented Programming, 
by Clemens Szyperski, while William takes a look at Neil J. Gunther's The 
Practical Performance Analyst: Performance-By-Design Techniques for 
Distributed Systems.

























FORUM

EDITORIAL 8

by Jonathan Erickson

LETTERS 10

NEWS & VIEWS 16

by the DDJ staff

OF INTEREST 142

by Eugene Eric Kim

SWAINE'S FLAMES 144

by Michael Swaine





RESOURCE CENTER

As a service to our readers, source code and
related files, and author guidelines are available
at http://www.ddj.com/. Source code is also
available via anonymous FTP from ftp.ddj.com/
(199.125.85.76). Letters to the editor, article
proposals/submissions, and inquiries can be sent
to editors@ddj.com/, faxed to 650-358-9749, or
mailed to Dr. Dobb's Journal, 411 Borel Ave.,
Suite 100, San Mateo, CA 94402-3522.

For subscription questions, change of address,
and orders, call 800-456-1215 (U.S. or Canada).
For all other countries, call 303-678-8475 or fax
303-661-1181. E-mail subscription questions to
ddj@neodata.com/ or write to Dr. Dobb's Journal,
P.O. Box 56188, Boulder, CO 80322-6188.

Back issues may be purchased for $9.00 per
copy (includes shipping and handling). For issue
availability, send e-mail to orders@mfi.com/, fax
to 785-841-2624, or call 800-444-4881 (U.S. and
Canada) or 785-838-7500 (all other countries).
Back issue orders must be prepaid. Please send
payment to Dr. Dobb's Journal, 1601 W. 23rd
Street, Suite 200, Lawrence, KS 66046-2700.

Individual back articles may be purchased 
electronically at http://www.ddj.com/ as ZIP 
archives. 

NEXT MONTH 

In June, we examine the topic of 
object-oriented design, and announce the 
recipients of the 1999 Dr. Dobb's 
Excellence in Programming Awards. 



NOTICE: This material may be protected
by copyright law (Title 17 U.S. Code).



OBJECT PERSISTENCE: BEYOND SERIALIZATION

Increasing productivity
and reducing maintenance

Timo Salo, Justin Hill, Scott Rich, Chuck Bridgham, and Daniel Berg



Most commercial high-volume databases are based on either
the relational or service paradigm (that is, databases
encapsulated within transaction processing monitors).
Persisting objects in these nonobject-oriented databases
is a major challenge when building large-scale applications.

On a small scale, object persistence is easy to solve.
Serialization, for example, has been presented as a method for
providing simple object persistence. However, scaling up
introduces a new set of requirements. Many enterprise object
systems involve object models with complex inheritance
hierarchies and large numbers of object relationships. The
run-time configuration often includes multiuser databases that can
be both relational and nonrelational. The object model and
database model are often designed by different groups of
people, therefore requiring a loose coupling between the
models. The design of a scalable object persistence framework
must adequately address issues related to performance with
complex object models, support for complex object
transactions, transformations from object inheritance structures and
associations to native database structures, translating object
queries to native database queries, and accessing objects across
multiple database paradigms.

The authors are software engineers working in IBM's VisualAge
Features Development group. They can be contacted at
tjsalo@us.ibm.com.

There are several standards and specifications related to ob- 
ject databases and object persistence, including the Object Man- 
agement Group (OMG) Standard, Object Database Manage- 
ment Group (ODMG) Standard, and Enterprise JavaBeans (EJB) 
Specification. However, none of these specifications address 
the actual implementation of a persistence engine. At best they 
describe interfaces and high-level components that form the 
API of the system. 

In this article, we'll describe techniques and frameworks
required to successfully implement scalable object persistence
for complex systems. We'll address topics such as required
metainformation, read-ahead and caching, queries, object
associations, and concurrent and nested transactions. We have
pioneered these techniques for almost 10 years in many
large-scale projects. Various aspects of the technology we describe
have been incorporated in IBM development tools, including
VisualAge for Java (Persistence Builder), VisualAge for Smalltalk
(ObjectExtender), and EJB development tools for WebSphere.

Figure 1: High-level architecture for a persistence framework.
The development environment comprises browsers, object models,
maps, and the database schema; the run-time environment comprises
transactions, object versions, business objects and relationships,
homes, and data access services.

General Architectures 

Persistence frameworks typically consist of two high-level
components: the development-time toolkit and the run-time
persistence engine. Figure 1 is an example of high-level
architecture for a persistence framework.

The development toolkit usually includes tools for collecting
metainformation about the object model and database, and tools
for generating business object classes and database queries.

There are two approaches for implementing the run-time
engine. One approach is to have the metainformation
available at run time, and generate the queries for retrieving
objects on-the-fly as the application traverses various object
relationships. This approach makes it possible to build dynamic,
flexible applications that have no navigation restrictions
within the object model. However, the amount of memory used
by the metainformation and run-time query generation
usually results in poorer performance. Another approach is to
generate the queries at development time. Little explicit
metainformation is needed at run time with this approach.
Execution of the generated queries is faster, because run-time
inferencing is not needed and the queries can often be optimized
for the database. The drawback is that the object model
traversal paths are fixed. If more paths are needed, more queries
need to be generated and compiled.

Figure 2: Relationships between various metamodels (persistent
object model, object-to-data mapping model, and data model).

Metainformation 

Metainformation for an object persistence framework includes
information about the application's object model, the target
database's data model, and the queries needed to service the
application. As Figure 2 shows, the information is often grouped
into the following models:

• The data model for describing the relevant subset of the
database schema.

• The persistent object model for describing the persistent
components of the business domain model.

• The mapping model for describing the mapping between the
object model and the data model.

How much detail is captured and whether the metainformation
is partitioned in one large model or various separate
submodels depends on issues of flexibility, efficiency, and
expressiveness. Therefore, there is no single correct way to package
the information, but all the following must be captured in some
form somewhere in the framework.

The data model represents the logical view of the database. It
is a subset of the tables, views, and columns in the database schema
that are relevant to object systems. This includes information on
entity qualifier names, logical and physical names of entities,
column datatypes, and conversions from database types to object
language types. Further refinements could include information on
database column functions such as sums and averages.

The data model can be augmented with information that is
not explicitly kept in the database schema. For instance, the
relationships implicitly defined by the foreign-key references
in the schema can be modeled as first-class connection
objects in the data model. Enhancing the data model with
connections makes the mapping of object associations to database
relationships a significantly simpler task. Figure 3 presents an
enhanced data model (the structural data model).

Figure 3: A structural data model.

It is not absolutely necessary to have a separate data mod- 
el. However, without such a model, much of this information 
must be captured in the mapping model, thus overloading its 
behavior and state. 

The persistent object model is a subset of the application's
object model. It represents only the portion of the business
object model that requires persistence behavior. It can be a
subset of classes within the complete business object model and a
subset of the instance variables within a single class. The
allowed types for the attributes can also be captured for
validation purposes.

Besides modeling the simple attributes, the associations
between the persistent objects can also be modeled. This makes the
object model independent from the mapping model, allowing a
clear mapping between the foreign-key relationships in the data
model and the object associations in the persistent object model.

The definition for the object identifiers can be captured in the
object model rather than in the mapping model, again allowing
simple mapping between the primary key column(s) in the data
model and object identifier in the object model.

The persistent object model is optional, and much of the
information that it provides can be held in the mapping model.
However, without the object model (as well as without the data
model) there is a risk of overloading the behavior and state of
the mapping model.

The most minimal system that would be of any interest
requires at least a model of mapping between the object
structure on one side and the target database structure on the
other. The mapping model contains the essential instructions
to the system of where the data retrieved from the database
is to be placed in the objects. The mapping model must
define which object class corresponds to which table and which
object attributes correspond to which columns. Refinements
could include mapping one object to multiple tables and one
instance variable to multiple columns, conversions of column
data from primitive types to higher-level object types, and
defining which columns act as database-conflict-detection
predicates. Figure 4 shows examples of class-to-table
mapping schemes.

(Figure 4 panels: mapping attributes from a class to a single
table, from a class across multiple tables, and from multiple
classes to a single table.)
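
The simplest scheme in Figure 4, one class mapped to one table, can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the data structure used by any IBM tool: the names ClassMap, map, key, and selectByKeySql are hypothetical, as are the EMP table and its columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical mapping-model entry: one object class mapped to one table. */
final class ClassMap {
    final String className;
    final String tableName;
    // attribute name -> column name, kept in declaration order
    private final Map<String, String> columnsByAttribute = new LinkedHashMap<>();
    private String keyAttribute;

    ClassMap(String className, String tableName) {
        this.className = className;
        this.tableName = tableName;
    }

    /** Records that an object attribute corresponds to a table column. */
    ClassMap map(String attribute, String column) {
        columnsByAttribute.put(attribute, column);
        return this;
    }

    /** Names the attribute that acts as the object identifier. */
    ClassMap key(String attribute) {
        keyAttribute = attribute;
        return this;
    }

    /** Builds a parameterized single-object read query from the mapping. */
    String selectByKeySql() {
        String keyColumn = columnsByAttribute.get(keyAttribute);
        if (keyColumn == null) {
            throw new IllegalStateException("key attribute is not mapped");
        }
        StringJoiner columns = new StringJoiner(", ");
        for (String column : columnsByAttribute.values()) {
            columns.add(column);
        }
        return "SELECT " + columns + " FROM " + tableName
                + " WHERE " + keyColumn + " = ?";
    }
}

class MappingDemo {
    public static void main(String[] args) {
        ClassMap employee = new ClassMap("Employee", "EMP")
                .map("id", "EMPNO")
                .map("name", "ENAME")
                .key("id");
        System.out.println(employee.selectByKeySql());
    }
}
```

Whether such a query is produced at run time from the mapping, as here, or generated ahead of time is the trade-off discussed under "General Architectures."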

If associations are to be supported transparently, then the
mapping must also define which foreign-key relationship corresponds
to which object association in the object model. Figures 5 and 6
illustrate various relationship-mapping schemes.

Finally, if inheritance is supported then the mapping model
should capture all such information. This would include the type
of inheritance employed in the database, type discriminator
values for choosing the appropriate class, and/or foreign-key
relationships between tables. Figure 7 shows examples of
inheritance mapping schemes.

Cache

Various read-ahead and caching strategies can improve a
persistence framework's efficiency and flexibility. Without
read-ahead and caching capabilities, the application is always starved
for data, parsimoniously reading from the database as
associations in the persistent object model are traversed and bringing
back data only one level at a time. With an object model that
has many relationships, this can cause a large number of
expensive database roundtrips.

A read-ahead scheme lets the application minimize the
number of database roundtrips by retrieving large object composition
trees within one query. Read-ahead involves instantiating the
requested objects and caching the data for their related objects,
thereby making sure that the data is present for the objects that
are most likely needed next by the application. How far ahead
objects are read is determined by application requirements.
Flexibility is gained as the queries can be tuned without affecting the
structure or workflow of the application.

Figure 4: Various class-to-table mapping schemes.

Figure 5: 1:1 association mapping schemes (mapping a 1:1
relationship to a forward-pointing reference, to a
backward-pointing reference, and to a single table).

Reading objects ahead often results in too much data. There- 
fore, it is desirable to keep the data in binary format to delay 
or avoid the performance cost of instantiating unused objects. 
Instantiation of persistent objects is then performed in two
stages: First, the data is brought into the cache, then the ob- 
jects are instantiated from the cache upon demand. Leaving 
the data in a form that is smaller than a fully instantiated ob- 
ject saves space as well. 

The key to implementing the read-ahead feature is to extend
the caching scheme to include the relationship semantics of the
underlying database. Database queries have fixed access paths
that may differ from the object model navigation order.
Therefore, the data in the cache must be organized in a fashion that
allows dynamically composing any access paths defined in the
database. In the case of relational databases, this means that
the foreign-key references are extracted from the result set and
maintained in a structured data cache. Figure 8 shows a
structured data cache.

Registry 

To guarantee the uniqueness of the objects within the
application's memory, each instantiated persistent object must be
registered into a centralized registry. The objects are usually
identified in the registry using their persistent object
identifiers; see Figure 9.

As Figure 10 illustrates, when an object is retrieved using its 
object identifier the registry is searched first, then the data cache, 
and finally the database. The registry can be global if it is im- 
plemented using weak pointers, because objects are automati- 
cally removed from the registry when other objects no longer 
reference them. However, if weak pointers are not available, the 
registry must be localized. For example, transactions provide a 
good scope for local registries. 
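
The search sequence (registry, then data cache, then database) can be illustrated with a small Java sketch. It assumes weak references are available, uses a plain function as a stand-in for the database read, and represents cached rows as raw Object arrays; the class name ObjectLocator and all of its members are hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical lookup helper: registry first, then data cache, then database. */
final class ObjectLocator<K, V> {
    // Weak references let unreferenced objects be reclaimed. This sketch never
    // purges cleared entries; a real registry would (for example, with a ReferenceQueue).
    private final Map<K, WeakReference<V>> registry = new HashMap<>();
    private final Map<K, Object[]> dataCache = new HashMap<>(); // uninstantiated rows
    private final Function<K, Object[]> database;               // stand-in for a database read
    private final Function<Object[], V> instantiate;            // builds an object from a row
    int databaseReads = 0;

    ObjectLocator(Function<K, Object[]> database, Function<Object[], V> instantiate) {
        this.database = database;
        this.instantiate = instantiate;
    }

    /** Called by read-ahead queries to park raw data for later instantiation. */
    void cacheRow(K id, Object[] row) {
        dataCache.put(id, row);
    }

    V find(K id) {
        WeakReference<V> ref = registry.get(id);
        V object = (ref == null) ? null : ref.get();
        if (object != null) {
            return object;                      // 1. already instantiated
        }
        Object[] row = dataCache.remove(id);    // 2. read ahead earlier
        if (row == null) {
            row = database.apply(id);           // 3. fall back to the database
            databaseReads++;
        }
        if (row == null) {
            return null;                        // no such object
        }
        object = instantiate.apply(row);
        registry.put(id, new WeakReference<>(object));
        return object;
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        ObjectLocator<String, StringBuilder> locator = new ObjectLocator<String, StringBuilder>(
                id -> new Object[] { id, "from database" },
                row -> new StringBuilder(row[0] + ": " + row[1]));
        locator.cacheRow("A1", new Object[] { "A1", "from cache" });
        StringBuilder first = locator.find("A1");
        StringBuilder again = locator.find("A1");
        System.out.println(first + " / same instance: " + (first == again));
        System.out.println(locator.find("A2") + " / database reads: " + locator.databaseReads);
    }
}
```

Without weak references, the same lookup could be kept per transaction, which matches the local registries mentioned above.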



Figure 6: 1:n association mapped to a backward-pointing
foreign-key reference.

Figure 7: Inheritance mapping schemes (mapping a class hierarchy
to a single table, to root and leaf tables, and to distinct tables).

Figure 8: Structured data cache (a data cache for the C class
holding primary keys and instance data, with foreign-key
associations C->A and C->B).







Queries

From the persistence framework's point of view, queries are the 
behavior of persistent objects on their target database. Query in 
this context means any operation supported by the target database 
and executed by the persistence framework. This includes ba- 
sic create, read, update, and delete operations; inquiries (does 
an entity exist in the database, the sum of a set of columns); 
and specific operations defined by a particular database server 
such as "balance the account."

Invocation differences between different target datastores in- 
clude details such as native query representation, error handling, 
and result data interpretation and processing. The native query 
representation typically can be strings (as with dynamic SQL), 
host variables (static SQL, stored procedures), or records (main- 
frame messaging). 

Encapsulating the native query details within query objects 
can standardize target database invocation. For instance, an ob- 
ject application would never know whether the query object 
contains a SQL string, or invokes a stored procedure or a mes- 
sage to a mainframe transaction-processing monitor. Figure 11 
presents two sets of encapsulated queries targeting two differ- 
ent types of datastores. 
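
A minimal Java sketch of this encapsulation follows. The interface WriteQuery, the Account class, the ACCOUNT table, and the fixed-width record layout are all assumptions made for illustration; a production query object would bind host variables or fill a real message buffer rather than return a string.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical business object used by both query implementations. */
final class Account {
    final int number;
    final long balanceCents;

    Account(int number, long balanceCents) {
        this.number = number;
        this.balanceCents = balanceCents;
    }
}

/** The application sees only this interface, never the native representation. */
interface WriteQuery<T> {
    /** Returns the native request that would be sent to the datastore. */
    String toNativeRequest(T object);
}

/** Relational flavor: the native form is a SQL string. */
final class SqlUpdateAccount implements WriteQuery<Account> {
    public String toNativeRequest(Account a) {
        // Values are inlined for readability; real code would use parameter markers.
        return "UPDATE ACCOUNT SET BALANCE = " + a.balanceCents
                + " WHERE ACCTNO = " + a.number;
    }
}

/** Mainframe-messaging flavor: the native form is a fixed-width record. */
final class RecordUpdateAccount implements WriteQuery<Account> {
    public String toNativeRequest(Account a) {
        // Assumed layout: 4-char transaction code, 8-digit account, 12-digit balance.
        return String.format("UPDA%08d%012d", a.number, a.balanceCents);
    }
}

class QueryObjectDemo {
    public static void main(String[] args) {
        Account account = new Account(42, 1500L);
        List<WriteQuery<Account>> queries =
                Arrays.asList(new SqlUpdateAccount(), new RecordUpdateAccount());
        for (WriteQuery<Account> query : queries) {
            System.out.println(query.toNativeRequest(account));
        }
    }
}
```

The calling code is identical for both datastores, which is the point of the query-object abstraction.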

Queries can be grouped into two broad categories: write
queries (SQL insert, update, and delete, for example) and read
queries (SQL select).

Input for write queries can be either keys (for instance, delete
an object based on its key) or full objects (insert an object),
either of which can be collections. Queries targeting relational
databases operate on a single object. Queries targeting stored
procedures or mainframe transaction-processing monitors
usually take multiple objects as input parameters.

Write queries extract the data from persistent objects and
convert it to the target database form. Depending on the datastore,
the data is placed into a query string, a query's host variables,
or a record structure. In the case of nested records (mainframe
messaging), the data may also need to be recomposed
according to the nesting structure; see Figure 12.

Because relational write queries can operate only on one 
object at a time, the number of database roundtrips within a 
complex transaction often becomes high. A useful performance 
optimization is to group the native queries together, then send 
them to the database as one package at the end of the trans- 
action. Many relational databases support this kind of "batch" 
behavior. For procedure calls this is the typical mode of op- 
eration. 

Read queries fall into two categories— those that have no 
scope limiting conditions ("all instances" queries, for exam- 
ple) and those that require parameters for search conditions 
("finder" queries). Read queries that require parameters must 
address the same data conversion and recomposition issues 
as the write queries. 

Restructuring the resulting data is necessary when the data is
not shaped along object lines and/or the result contains data for
more than one kind of object. For example, queries involving
certain inheritance strategies or reading ahead trees of objects
require joins and unions that result in tuples containing data for
multiple objects. A useful abstraction for the result processing
is a data extractor. The data extractor contains all the necessary
logic to extract, convert, validate, and compose the data into a
form suitable for the target persistent object. In case of
relational joins, the extraction logic must also eliminate redundant
entries in the result set; see Figure 13.

To optimize the number of database roundtrips, the read
queries need to be capable of loading trees of objects rather than
reading one object at a time. The required native operations for
relational queries are equijoin for loading chains of objects,
unions and set differences for loading trees, and left outer joins
for loading trees that allow missing leaves.

Figure 9: Making objects unique using a registry.
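
As a rough illustration of a data extractor, the Java sketch below takes the rows of a left outer join between a customer table and an order table and composes them into unique customer objects, dropping the customer data repeated on each row. The Customer class, the row layout, and the method name extract are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical target object: a customer with the numbers of its orders. */
final class Customer {
    final String custNo;
    final String name;
    final List<String> orderNumbers = new ArrayList<>();

    Customer(String custNo, String name) {
        this.custNo = custNo;
        this.name = name;
    }
}

final class CustomerExtractor {
    /**
     * Each row is {CUSTNO, NAME, ORDERNO}, as produced by a left outer join.
     * ORDERNO is null when the customer has no orders (a missing leaf).
     */
    static List<Customer> extract(List<String[]> rows) {
        Map<String, Customer> byKey = new LinkedHashMap<>();
        for (String[] row : rows) {
            Customer customer = byKey.get(row[0]);
            if (customer == null) {
                // First row for this key: instantiate the parent once.
                customer = new Customer(row[0], row[1]);
                byKey.put(row[0], customer);
            }
            if (row[2] != null) {
                customer.orderNumbers.add(row[2]);
            }
        }
        return new ArrayList<>(byKey.values());
    }
}

class ExtractorDemo {
    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] { "C1", "Acme", "O-10" });
        rows.add(new String[] { "C1", "Acme", "O-11" });
        rows.add(new String[] { "C2", "Globex", null });
        for (Customer c : CustomerExtractor.extract(rows)) {
            System.out.println(c.name + " " + c.orderNumbers);
        }
    }
}
```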

Associations 

Describing the associations be- 
tween object classes is an essential element of object modeling 
and design. UML and other object modeling methodologies pro- 
vide ways of defining the semantics of associations in terms of 
their cardinality and navigability. 

The behavior of associations can be fairly complex. The
implementation details can be hidden behind accessor methods
(get methods). Accessors for one-to-one associations return the
member object of the association. An accessor for a one-to-many
association returns a collection of member objects. Another
approach (see Figure 14) is to implement associations as first-class
objects (in-place association instances, proxies).

At run time, the object referential integrity should be
maintained according to the semantics specified in the object
model, while allowing the application programmer the easiest and
most flexible interface to the relationships. Mutators (set
methods, for example) and collection add/remove methods should
automatically invoke the appropriate referential integrity
maintenance behavior, such as updating the inverse association.

Associations are especially important for persistent objects
mapped to relational databases because associations can also
provide automatic means for maintaining the database key
referential integrity. When connecting persistent objects, the
association will determine which persistent object holds the foreign
key and update it appropriately with the primary key of the
other object. Manually coding the database key maintenance
is error prone and can easily lead to unmaintainable code.
Figure 15 illustrates automatic maintenance of object and
database key referential integrity. In this example, an employee
object is automatically removed from its old department when
the object is added to a new department. Also, the inverse
relationship from the employee to the department is updated
automatically.
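
The employee and department example can be sketched in Java as follows. This is a simplified illustration that places the maintenance logic in the collection add method; the class and field names (including deptNoForeignKey) are hypothetical and do not come from Figure 15.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Employee {
    final String name;
    Department department;     // inverse side of the association
    Integer deptNoForeignKey;  // value destined for the foreign-key column

    Employee(String name) {
        this.name = name;
    }

    Department getDepartment() {
        return department;
    }
}

final class Department {
    final int deptNo;          // primary key
    private final List<Employee> employees = new ArrayList<>();

    Department(int deptNo) {
        this.deptNo = deptNo;
    }

    /** Adds the employee and keeps both object and key references consistent. */
    void addEmployee(Employee e) {
        Department old = e.department;
        if (old == this) {
            return;                          // already a member
        }
        if (old != null) {
            old.employees.remove(e);         // leave the old department
        }
        employees.add(e);
        e.department = this;                 // update the inverse association
        e.deptNoForeignKey = deptNo;         // update the foreign key
    }

    List<Employee> getEmployees() {
        return Collections.unmodifiableList(employees);
    }
}

class AssociationDemo {
    public static void main(String[] args) {
        Department sales = new Department(10);
        Department research = new Department(20);
        Employee ann = new Employee("Ann");
        sales.addEmployee(ann);
        research.addEmployee(ann);
        System.out.println(sales.getEmployees().size() + " "
                + research.getEmployees().size() + " " + ann.deptNoForeignKey);
    }
}
```

Returning an unmodifiable view from getEmployees forces all changes through addEmployee, so the application cannot bypass the integrity maintenance.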

Associations provide a semantically meaningful way for
controlling the retrieval of objects from the database. As the
application traverses associations, the related objects can be
retrieved accordingly. Depending on the association, it is
sometimes also desirable that traversal of one association triggers
the retrieval of an entire graph of related objects. However, this
kind of object graph read-ahead behavior requires advanced
querying and caching techniques as described in the previous
sections.

Translation from the object associations to the native database
relationships may be very complex (see Figure 16). A simple
relationship between two classes often translates to multiple
relationships between multiple tables when inheritance is involved.

Transactions 

In enterprise environments a single server application may serve 
multiple concurrent client transactions, each accessing an over- 
lapping set of objects. 

Many enterprise applications that reflect complex business
processes (see Figure 17) also require that users can navigate
freely between different views of the user interface, work
with the result of uncommitted changes across views, and
commit or cancel work that has been done on a view and
on all subviews opened in a nested fashion. In short, the
nature of complex multiuser enterprise applications requires
that objects can be accessed from multiple concurrent and
nested transactions.

To ensure the consistency of concurrently running
transactions they need to be isolated from each other. The two
methods for isolating the transactions are the conflict avoidance
scheme ("pessimistic" scheme) and the conflict detection
scheme ("optimistic" scheme). Which one to use depends on
the type of transaction. Transactions that have a high penalty
for failure should do whatever possible to prevent the failure
(by explicitly locking the resources as early as possible). With
low penalty transactions it is often worth trading the risk of
failure to gain efficiency by using a conflict detection scheme.

Figure 10: Search sequence when retrieving objects.

The objects are copied from the database into the
application's memory, where they may be held for extended periods
of time. Therefore, the transaction isolation actually consists of
two components: the object-level isolation within one
application, and the database-level isolation across multiple
applications. Both isolation components address multiuser issues,
because one server application may also serve multiple clients, as
in Figure 17.

The conflict avoidance scheme for GUI-driven, long-running
transactions is usually unacceptable from a performance
perspective. A conflict detection scheme where each transaction
has a version of the concurrently accessed objects provides
significantly better performance. However, managing multiple
versions of the same object can be fairly complex.



One approach for implementing an object versioning
mechanism is to divide business objects into two parts: a wrapper
and a version (for example, an EJBObject and an EntityBean).
When any object refers to a business object, it actually refers to
its wrapper. The wrapper delegates the method invocations to
the appropriate version, which contains the object's business
behavior and instance data. When a business object is first
accessed (get/set a property) within a transaction, a new version
of the object is added to the current transaction's local registry.
The new version is based on the version in the parent
transaction's registry. Figure 18 shows multiple object versions within
a tree of nested transactions.

Upon commit, the versions in a child transaction's registry are 
merged with its parent transaction's corresponding versions. If 
the transaction is a top-level transaction, the versions are also 
written into the database. The logic for detecting and resolving 
conflicts on merge is highly application dependent. The test may 
be as simple as comparing parent and child version numbers in



update address set streetno=34, ... where custno=456 and streetno=56



Example 1: Update statement with conflict detection predicates. 



Figure 11: Two sets of encapsulated queries. (Diagram labels: relational query set; procedure call query set.)




Figure 12: Recomposing object instance data according to a record-nesting structure.

30 Dr. Dobb's Journal, May 1999 



order to determine if the parent version has been changed after
the child version was created. For more advanced,
application-dependent testing, the wrapper could have a conflict
resolution callback method.

On rollback the child versions are simply dropped instead of 
having to restore object states in the parent transaction. After 
rollback there is no trace that ei- 
ther the child transaction or the 
child versions ever existed. 
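
The commit and rollback behavior described above can be sketched as follows. This is a minimal sketch, assuming the simplest conflict test the text mentions (comparing version numbers). The names NestedTx and Version are hypothetical, object state is reduced to a single string, only written objects are checked, and writing top-level versions to the database is left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical immutable version: a number plus the object's state.
final class Version {
    final int number;    // bumped each time a child's change is merged
    final String state;
    Version(int number, String state) { this.number = number; this.state = state; }
}

final class NestedTx {
    private final NestedTx parent;  // null for a top-level transaction
    private final Map<String, Version> registry = new HashMap<>();    // object id -> local version
    private final Map<String, Integer> baseNumber = new HashMap<>();  // parent's number at first access
    private final Set<String> dirty = new HashSet<>();                // ids written in this transaction

    NestedTx(NestedTx parent) { this.parent = parent; }

    // Assumption: a top-level transaction is seeded with state read from the database.
    void load(String id, String state) { registry.put(id, new Version(0, state)); }

    String read(String id) { return touch(id).state; }

    void write(String id, String state) {
        Version v = touch(id);
        registry.put(id, new Version(v.number, state));
        dirty.add(id);
    }

    // First access copies the parent's version and remembers its number.
    private Version touch(String id) {
        Version v = registry.get(id);
        if (v == null) {
            if (parent == null) throw new IllegalStateException("not loaded: " + id);
            v = parent.touch(id);
            baseNumber.put(id, v.number);
            registry.put(id, v);
        }
        return v;
    }

    // Merge into the parent, failing if the parent version changed since first access.
    void commit() {
        if (parent == null) return;  // top level: the database write is omitted in this sketch
        for (String id : dirty) {
            if (parent.touch(id).number != baseNumber.get(id)) {
                throw new IllegalStateException("conflict on " + id);
            }
        }
        for (String id : dirty) {
            parent.registry.put(id, new Version(baseNumber.get(id) + 1, registry.get(id).state));
            parent.dirty.add(id);
        }
        rollback();  // the child's versions are no longer needed
    }

    // Rollback simply drops the child versions; the parent is never touched.
    void rollback() {
        registry.clear();
        baseNumber.clear();
        dirty.clear();
    }

    int trackedCount() { return registry.size(); }
}
```

With two sibling transactions that both write the same object, the first commit succeeds and the second throws; the loser can then roll back without any state having to be restored in the parent.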

Many relational databases provide little support for row-level
conflict avoidance. With most databases the row-level locking
is available only in conjunction with cursors. However, cursors
may be of little use for an object application that is accessing
and holding onto large numbers of different types of objects in a
random fashion. One trick for acquiring a row-level lock without
a cursor is to touch a corresponding row (update a column
without changing its value, for instance) when an object is first
accessed within a transaction. If the row is already locked, the
desirable action is often to raise an exception instead of
waiting for the lock to be released.
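
As a small illustration of the "touch" trick, the following sketch builds the no-op update statement for a row. The class and method names are hypothetical, and the table and column names are borrowed from Example 1. How a database reports an already locked row (waiting versus raising an error) is vendor specific and is not shown here.

```java
// Hypothetical helper that builds a "touch" statement: an update that assigns a
// column to itself, so the row is locked without its data being changed.
final class TouchStatement {
    // Assumption: identifiers come from trusted mapping metadata, never from user input.
    static String build(String table, String column, String keyColumn) {
        return "UPDATE " + table
             + " SET " + column + " = " + column
             + " WHERE " + keyColumn + " = ?";
    }

    public static void main(String[] args) {
        // prints "UPDATE address SET streetno = streetno WHERE custno = ?"
        System.out.println(build("address", "streetno", "custno"));
    }
}
```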

As with object level isolation, the logic for detecting and re- 
solving database conflicts is application dependent. The two 
common conflict detection methods are either to reread and 
compare the database row to the modified object, or to add col- 
lision detection predicates (a set of attributes that constitute a 
conflict) to the where clause of the database update statement. 
Example 1 demonstrates conflict detection predicates. The up- 
date statement will fail if another user has changed the street 
number from its old value. 
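
A statement like the one in Example 1 could be generated from an object's old and new attribute values. The sketch below is one possible way to do it, not the authors' implementation; the GuardedUpdate class is hypothetical, and it only builds a parameterized statement and its parameter list rather than talking to a database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical builder for an update guarded by conflict detection predicates.
final class GuardedUpdate {
    final String sql;
    final List<Object> params;

    private GuardedUpdate(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    // newValues: columns to set; key: primary-key columns;
    // oldValues: the values read earlier, used as conflict detection predicates.
    // Assumption: callers pass maps with a stable iteration order (e.g. LinkedHashMap).
    static GuardedUpdate build(String table,
                               Map<String, Object> newValues,
                               Map<String, Object> key,
                               Map<String, Object> oldValues) {
        StringBuilder sb = new StringBuilder("UPDATE ").append(table).append(" SET ");
        List<Object> params = new ArrayList<>();
        appendPairs(sb, newValues, ", ", params);
        sb.append(" WHERE ");
        Map<String, Object> where = new LinkedHashMap<>(key);
        where.putAll(oldValues);  // key columns first, then the predicates
        appendPairs(sb, where, " AND ", params);
        return new GuardedUpdate(sb.toString(), params);
    }

    // Appends "col = ?" pairs joined by the separator and collects the values in order.
    private static void appendPairs(StringBuilder sb, Map<String, Object> pairs,
                                    String separator, List<Object> params) {
        boolean first = true;
        for (Map.Entry<String, Object> e : pairs.entrySet()) {
            if (!first) sb.append(separator);
            sb.append(e.getKey()).append(" = ?");
            params.add(e.getValue());
            first = false;
        }
    }
}
```

When such a statement is executed through JDBC, PreparedStatement.executeUpdate() returns the number of rows affected, so a result of zero would indicate that another user changed one of the guarded columns.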

Rereading and comparing rows is expensive and should be 
used sparingly, because it requires multiple database 
roundtrips — locking, reading, and updating the row. On the 
other hand, the use of conflict detection predicates is 
lightweight and works fine in most situations. More sophisti- 
cated detection schemes can be composed of combinations 
of the aforementioned commands. 



Serialization has been presented 
as a method for providing simple 
object persistence 



Most commercial databases have referential integrity (RI)
constraints for maintaining the consistency of the database.
These constraints require the database's store and delete
operations to be executed in a specific order. This order does
not necessarily match the order in which the objects are created
or deleted within an object application. Furthermore, the
database RI constraints do not map to the logical object
associations in a consistent way. RI rules are enforced based on
the foreign-key references, which may have more than one possible
transformation when mapped to object associations. Manually
coding the operation ordering is time consuming and error prone,
easily leading to unmaintainable code. It is preferable to defer
execution of the operations and let the transaction automatically
decide the ordering upon its commit.

The ordering algorithm utilizes the information of how the
object associations are mapped to the primary-key/foreign-key
column pairs in the database, and the integrity rules defined
for the key columns. For each object within the transaction,
the algorithm iterates over the associations the object has with
other objects. For each association, the algorithm tests if the
object has either insert precedence (if the object is to be
inserted) or delete precedence (if the object is to be deleted)
over the association. If the object has a higher precedence,
it will be moved accordingly in the transaction's participant
list. Due to the nature of relational RI constraints, the
algorithm remains fairly simple, because there cannot be circular
constraints defined in the database (otherwise it would be
impossible to insert a row that has a prerequisite to its own
prerequisite).
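
One way to picture the ordering step is as a depth-first topological sort over the foreign-key references. Note that this swaps in a standard technique; the article describes moving objects within the participant list, and its exact algorithm is not given. The CommitOrdering class and the use of plain strings as object identifiers are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical ordering helper. "references" maps each object to the objects whose
// rows it points to through foreign keys (i.e. rows that must exist before it).
final class CommitOrdering {

    // Inserts: a referenced row must be stored before any row that refers to it.
    static List<String> insertOrder(List<String> participants,
                                    Map<String, List<String>> references) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String p : participants) {
            visit(p, participants, references, ordered);
        }
        return new ArrayList<>(ordered);
    }

    // Deletes: the reverse order, so referring rows are removed first.
    static List<String> deleteOrder(List<String> participants,
                                    Map<String, List<String>> references) {
        List<String> order = insertOrder(participants, references);
        Collections.reverse(order);
        return order;
    }

    // Depth-first visit. Assumption (from the text): the constraints are acyclic,
    // so the recursion always terminates.
    private static void visit(String obj, List<String> participants,
                              Map<String, List<String>> references, Set<String> ordered) {
        if (ordered.contains(obj)) return;
        for (String target : references.getOrDefault(obj, Collections.emptyList())) {
            if (participants.contains(target)) {
                visit(target, participants, references, ordered);
            }
        }
        ordered.add(obj);
    }
}
```

For example, if an employee row holds foreign keys to a department row and an address row, the employee is inserted last and deleted first, regardless of the order in which the application created the objects.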

API 

From the programming and maintenance point of view, the
number of persistence constructs that appear in the application
code should be kept as low as possible. Having a low number
of persistence constructs introduces minimal intrusion upon



Figure 13: Restructuring a relational result set and eliminating redundant entries. (Diagram labels: Query, Data Extractor, Data Cache.)



Figure 14: Associations implemented as first-class objects.

the application, thus allowing the database and application to
remain loosely coupled. This loose coupling between the
database and application lets you design an object model that
reflects the application domain rather than the database design,
and vice versa. The persistence framework must be intelligent
enough to perform many of the necessary persisting processes
automatically, without instruction from the application.
Implementing persistence constructs as first-class objects and
providing some of the persistence metainformation at run time
are two of the keys that make a successful persistence
framework. The interfaces provided by the persistence API can
be grouped into the following categories:

• Business object interface. Protocol for accessing attributes from 
the business object. 

• Life cycle interface. Protocol for creating and destroying busi- 
ness object instances. 

• Finder interface. Protocol for finding business object instances. 

• Transaction interface. Protocol for creating, committing, and 
rolling back transactions. 

For example, the Enterprise JavaBeans (EJB) Specification
defines interfaces that correspond to these categories. The remote
interface for entity Beans corresponds to the business object
interface. The EJB home interface has the same responsibilities as
the life cycle and finder interfaces. The transaction interface is
provided by the UserTransaction in the Java transaction package,
which is one of the prerequisites for the EJB.



Figure 15: Automatic maintenance of object and key referential integrity. (Diagram labels: Dept, Emp, remove(emp), add(emp), setDept(dept2), setDeptFrgnKey("D2").)




Figure 16: Complex translation from an object association to multiple database relationships. 



Transaction tx = new Transaction();
EmployeeHomeImpl empHome = EmployeeHomeImpl.singleton();
Employee emp;
AddressHomeImpl addrHome = AddressHomeImpl.singleton();
Address addr;

tx.begin();                           // begin a new transaction (transaction interface)
emp = empHome.findByKey("1234");      // find an employee instance (finder interface)
addr = addrHome.create();             // create an address instance (factory interface)
addr.setStreet("123 Somewhere Dr.");  // set attributes of the address (bus. object interface)
emp.setAddress(addr);                 // set employee's address (bus. object interface)
tx.commit();                          // commit the changes (transaction interface)



Example 2: Sample persistence API code. 




Figure 17: A typical system configuration in an enterprise 
environment. 




Figure 18: Multiple object versions within a tree of nested
transactions.

Example 2 demonstrates the use of the persistence API by re- 
trieving an employee object, creating an address object, associ- 
ating these two objects together, and committing the changes 
to the database. 

Conclusion

The rationale for building an object persistence framework is,
of course, increased productivity and reduced maintenance costs.
Independence between object applications and databases allows 
enterprises to develop and maintain more complex applications 
and still leverage existing data management infrastructures. 

Implementing a full-blown object persistence framework
easily represents several years' worth of work. The more
flexibility and performance that is required from the framework,
the more complex the framework becomes. Yet almost any
framework is better than no framework. Even a simple framework
can help in structuring the code in a clean and logical
way. For example, the mapping metainformation can implicitly
be represented as inlined code, and the query objects can
encapsulate handcrafted SQL strings. The areas where it is worth
spending more time on generic components, however, are
the associations and the transactions, because they have a
direct impact on the application programming model. There are
also several commercial object persistence frameworks available
that are usually a viable alternative to in-house development,
especially when the target application is complex and
critical to the enterprise.



DDJ






